Clinics register based HIV prevalence in Jimma zone, Ethiopia: applications of likelihood and Bayesian approaches

Background The distribution of HIV is not uniform in Ethiopia, with some regions recording higher prevalence than others. However, reported regional HIV prevalence estimates mask the heterogeneity of the epidemic within regions. The main purpose of this study was to assess the district differences in HIV prevalence and other factors that affect the prevalence of HIV infection in Jimma zone, Oromia region of Ethiopia. We aimed to identify districts which had higher or lower than zone average HIV prevalence. Such in-depth analysis of HIV data at district level may help to develop effective strategies to reduce the HIV transmission rate. Methods Data collected from 8440 patients who were tested for HIV status in government clinics in the 22 districts of Jimma zone between September 2018 and August 2019 were used for the analyses. A generalized linear mixed effects model with district random effects was applied to assess the factors associated with HIV infection, and the best linear unbiased prediction was used to identify districts that had higher or lower HIV infection. Both likelihood and Bayesian methods were considered. Results The statistical test on the district random effects variance suggested the need for district random effects in all the models. The results from applying both methods on the full data show that the odds of HIV infection are significantly associated with the covariates considered in this study. Disaggregation of prevalence by gender also highlighted the persistent features of the HIV epidemic in Jimma zone. After controlling for covariate effects, the results from both techniques revealed that there was heterogeneity in HIV infection prevalence among districts within Jimma zone, where some of them had higher and some had lower HIV infection prevalence compared to the zone average HIV infection prevalence.
Conclusions The study recommends that the government give attention to those districts which had higher HIV infection and conduct further research to improve their intervention strategies. Further, for those districts which had lower infection, it would be advantageous to identify the reasons for their performance and apply them to overcome HIV infection among residents in those districts which had higher HIV infection. The approach used in this study can also help to assess the effect of interventions introduced by the authorities to control the epidemic, and it can easily be extended to assess a region's HIV infection rate relative to the rate at the national level, or a zone's HIV infection rate relative to the rate at the regional level. Supplementary Information The online version contains supplementary material available at 10.1186/s12879-021-06965-0.

Background The sub-Saharan African region contributes more than two-thirds of the global Human Immunodeficiency Virus (HIV) burden, with 67.4% or 25.6 million HIV cases. The region is hardest hit by HIV in the world, followed by Asia and the Pacific, each with 5.8 and 2.2 million HIV cases, respectively. Of the 690,000 acquired immunodeficiency syndrome (AIDS)-related deaths in 2019, 300,000 and 140,000 occurred in East and Southern Africa, and West and Central Africa, respectively [1]. An estimated 670,000 people in Ethiopia were living with HIV in 2019, and an estimated 12,000 people died of AIDS-related illness [2]. Of the people living with HIV in 2019, 15,000 were newly infected cases. The overall national adult HIV prevalence in Ethiopia is estimated at 0.9%, whereas the prevalence rates for adult women and men are 1.2 and 0.6%, respectively. The figures are much higher in urban areas, about seven times higher than the rural HIV prevalence [3]. The HIV epidemiology of the country is also heterogeneous by geographic area: in 2017, Gambella region had the highest prevalence rate (4.8%), followed by the two city administrations Addis Ababa (3.4%) and Dire Dawa (2.5%). However, three-fourths of people living with HIV in 2017 were from Amhara (30%) and Oromia (26%) regions, and Addis Ababa (18%) city administration [3]. In addition, it varied by age and socioeconomic status.
The estimated HIV prevalence rate in Ethiopia declined from 3.3% in 2000 to 0.9% in 2019, and AIDS-related deaths from 83,000 deaths in 2000 to 15,000 in 2019, thus the country is on the right track to deliver on its commitments. However, the country's progress with the first 90% of the Joint United Nations Programme on HIV/AIDS (UNAIDS) 90-90-90 targets by 2020 was not achieved as expected [4], where the 90-90-90 targets by 2020 mean that 90% of people living with HIV know their HIV status, 90% of people who know their HIV-positive status are accessing treatment and 90% of people on treatment have suppressed viral loads [5]. The transmission of HIV appeared to continue in the country, particularly among the urban population, despite the government's persistent efforts to halt the epidemic, and HIV remained an issue of public health concern in Ethiopia [6].
The most common mode of HIV transmission is heterosexual sex; men who have sexual encounters with other men and injecting drug users are also at higher risk of HIV transmission [7]. In Ethiopia, the most common mode of transmission is heterosexual sex. Indicators that are related to sexual behaviour risks for HIV infection in the country are age at sexual debut, non-marital and non-cohabiting sexual partnerships, unprotected sexual intercourse, marital status and education level [6]. It has been reported that young females tend to have an earlier sexual debut than males, and that a large percentage of adults who had sexual intercourse with non-marital or non-cohabiting partners did not use a condom [6]. Studies in sub-Saharan Africa have shown that HIV prevalence varies by age, sex, place of residence and between countries [2,8].
The HIV distribution in Ethiopia is not uniform [6], with certain regions recording higher prevalence than others. The current national and regional HIV prevalence estimates mask this heterogeneity within the country. To design the most effective strategies to reduce the HIV transmission rate, it is essential to have a more in-depth analysis of the epidemiological patterns and risk factors of HIV in each zone or district in a region, because targeting high-risk areas with effective control measures yields good results in controlling the pandemic [9]. The main objective of the current study was to assess the factors that affect the prevalence of HIV infection in Jimma zone, Oromia region of Ethiopia. Further, we were interested in assessing the district differences in HIV prevalence in Jimma zone and ranking districts by their prevalence rates by applying the best linear unbiased prediction (BLUP). Most of the available studies include only public health facilities in Jimma town (e.g. see [10][11][12][13][14]), and information comparing the distribution of HIV across districts in Jimma zone is very limited. For example, a study conducted by [14] on HIV positive sero-status disclosure and its determinants among people living with HIV/AIDS attending the ART clinic in Jimma University Specialized Hospital indicates that age, sex, educational status and marital status had significant associations with HIV sero-status disclosure. Since that study was done in only one hospital, its findings cannot be generalized to the entire population of Jimma zone, and comparing the distribution of HIV/AIDS from district to district is difficult.
The rest of the paper is organized as follows. We first describe the data. Next, we specify and outline the likelihood and Bayesian techniques used for model estimation. The results from applying these methods on the study data are presented in Results section. Finally discussion on the results and limitations of the study, and

Study area and data
The data for this study were obtained from the public health centre offices of the Jimma zone districts. The Jimma zone is in the Oromia regional state and is located in the south-western part of Ethiopia between latitude 6° and 9° North and longitude 34° and 38° East, at altitudes ranging from 880 to 3340 meters above sea level. The total area of Jimma zone is 15,568.58 km² [15]. When patients visit government clinics due to illness, not necessarily HIV related, health professionals give them an orientation about HIV, including the means of transmission and the advantage of knowing their HIV status. They are then asked if they are willing to take an HIV test, and those who agree to do the test fill in a consent form. The clinics give pre- and post-test counselling. All patients aged 15 years and older who visited the clinics and were tested for their HIV status in the 22 districts of Jimma zone during the period September 2018 to August 2019 were included in this study. The data that were extracted from the patient register include patient HIV status (negative or positive), patient gender, age (15-19, 20-24, 25-49 or ≥ 50), marital status (single, married, divorced or widowed), educational level (no education, primary, secondary or superior), condom use (no or yes), religion (protestant, muslim, orthodox or other), occupation (no job, daily worker, farmer, merchant or government employee) and place of residence (rural or urban).

Ethical consideration
Permission for the study was obtained from the Postgraduate research office of Jimma University, College of Natural Sciences. In addition, permission to use the data for this study was obtained from the Jimma Zone Ministry of Health Office. The data provided to the authors do not contain any personal identifiers, therefore the anonymity of the patients was assured.

Generalized linear mixed model
As discussed earlier, HIV prevalence in Ethiopia varies by geographical location. Therefore, this study employed the logistic regression model with a random effect to quantify the variation in HIV prevalence that is accounted for by the district variance. Let the binary outcome variable $y_{ij}$ denote the HIV status of the $j$th patient in the $i$th district with probability $p_{ij}$, where $y_{ij} = 1$ for a patient who tested positive and $y_{ij} = 0$ for a patient who tested negative. A logistic regression model with a random effect for the outcome $y_{ij}$, i.e., a generalized linear mixed effects model (GLMM), is given by

$$g(p_{ij}) = \log\left(\frac{p_{ij}}{1 - p_{ij}}\right) = \eta_{ij} = x_{ij}'\beta + b_i, \quad j = 1, \ldots, n_i, \; i = 1, \ldots, m, \qquad (1)$$

where $g(\cdot)$ is the link function, $x_{ij} = (1, x_{1ij}, \ldots, x_{pij})'$ is the vector of $p$ explanatory variables or covariates measured on the $j$th patient in the $i$th district, $\beta$ is the vector of fixed regression coefficients or parameters, $b_i$ is a random effect varying over districts, $n_i$ is the number of patients in the $i$th district and $m$ is the number of districts, where for this study $m = 22$. Patients living in different districts are likely to vary in their risk of HIV infection, and the random effect $b_i$ enables us to include this unknown variation in the model. That is, the district random effects represent the differences in patients' HIV test results that are attributable to the districts but were not captured by any of the covariates $x_{1ij}, \ldots, x_{pij}$. Because $b_i$ is added to the linear predictor of all $n_i$ patients in a district, it induces a positive correlation among the $n_i$ responses.
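To make the role of the district random effect concrete, the following minimal Python sketch evaluates the inverse logit of the linear predictor for one patient in an average district and in districts one standard deviation below or above the average. The fixed-effect value `eta_fixed` is a hypothetical number chosen only for illustration and is not an estimate from this study; the standard deviation is taken as the square root of the full-data variance estimate (0.080) reported in the Results.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def patient_probability(eta_fixed, b_i):
    """P(y_ij = 1 | b_i) for a fixed-effect part x_ij' beta and district effect b_i."""
    return inv_logit(eta_fixed + b_i)

eta_fixed = -1.5             # hypothetical value of x_ij' beta (assumption)
sigma_b = math.sqrt(0.080)   # SD of district effects from the full-data variance

# Same covariates, three different districts: -1 SD, average, +1 SD
probs = {
    label: patient_probability(eta_fixed, k * sigma_b)
    for label, k in [("low", -1), ("average", 0), ("high", 1)]
}
for label, p in probs.items():
    print(f"{label:8s} district: P(HIV positive) = {p:.3f}")
```

Patients with identical covariates therefore receive different predicted probabilities depending only on the district effect, which is the within-district correlation described above.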
It is assumed that $b_i$ is independently and normally distributed with mean zero and variance $\sigma_b^2$. The conditional mean $p_{ij} = E(y_{ij} \mid b_i)$ is linked to the linear predictor $\eta_{ij}$ via the link function $g(\cdot)$, and the conditional distribution of $y_{ij}$ belongs to the exponential family. The conditional probability of $y_{ij}$ given the district-specific random effect $b_i$ is given by

$$f(y_{ij} \mid b_i) = p_{ij}^{y_{ij}} (1 - p_{ij})^{1 - y_{ij}}.$$

Assuming that the observations within district $i$ are independent given the random effect $b_i$, the conditional probability of the response vector for the $i$th district, $y_i = (y_{i1}, y_{i2}, \ldots, y_{in_i})'$, is given by

$$f(y_i \mid b_i) = \prod_{j=1}^{n_i} f(y_{ij} \mid b_i).$$

The marginal likelihood function of the $i$th district, which is also the likelihood function of $\beta$ and the variance of $b_i$, $\sigma_b^2$, is obtained by averaging over the distribution of $b_i$, that is,

$$L_i(\beta, \sigma_b^2; y_i) = \int \prod_{j=1}^{n_i} f(y_{ij} \mid b_i) \, f(b_i) \, db_i, \qquad (2)$$

where $f(b_i)$ is the $N(0, \sigma_b^2)$ density. The total marginal likelihood function is the product of the $m$ terms in Eq. (2) and hence

$$L(\beta, \sigma_b^2; y) = \prod_{i=1}^{m} \int \prod_{j=1}^{n_i} f(y_{ij} \mid b_i) \, f(b_i) \, db_i,$$

where $y = (y_1', y_2', \ldots, y_m')'$ is the total vector of the responses. Because the conditional distribution of $y_i \mid b_i$ is not normal, the marginal distribution generally does not have a closed form, hence the integral in Eq. (2) must be approximated in order to carry out statistical inference, including estimation of the parameters. In this study, we applied the adaptive Gaussian quadrature [16] numerical integration technique to approximate the integral. The combination of adaptive Gaussian quadrature for numerical approximation of the likelihood function and the Newton-Raphson method for optimization produces the most reliable results [17]. A likelihood ratio test was used to test for the association between each individual covariate and HIV prevalence. The asymptotic chi-square mixture distribution [18] test was applied to test $H_0: \sigma_b^2 = 0$ against $H_1: \sigma_b^2 > 0$. The district random effects were estimated using the best linear unbiased predictor (BLUP) procedure after controlling for the covariates or fixed effects [19]. The BLUP method is efficient, and the predicted values obtained using this method are realised values of the district random effects [20], therefore it yields the same ranking as the true values of the random effects [21].
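The integral over the district random effect can be illustrated with a short Python sketch. It uses ordinary (non-adaptive) 3-point Gauss-Hermite quadrature, which is a simplification: the adaptive version used in the study additionally recentres and rescales the nodes for each district, and practical fits use more nodes. The function names and the toy inputs are hypothetical and are not taken from the study's R code.

```python
import math

# 3-point Gauss-Hermite rule for the weight exp(-x^2); closed-form nodes and weights.
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (math.sqrt(math.pi) / 6, 2 * math.sqrt(math.pi) / 3, math.sqrt(math.pi) / 6)

def normal_expectation(g, sigma_b):
    """Approximate E[g(b)] for b ~ N(0, sigma_b^2) via the substitution b = sqrt(2)*sigma_b*x."""
    total = sum(w * g(math.sqrt(2.0) * sigma_b * x) for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)

def conditional_likelihood(y, eta_fixed, b):
    """Product of Bernoulli probabilities for one district, given its random effect b."""
    lik = 1.0
    for y_ij, eta in zip(y, eta_fixed):
        p = 1.0 / (1.0 + math.exp(-(eta + b)))
        lik *= p if y_ij == 1 else 1.0 - p
    return lik

def district_marginal_likelihood(y, eta_fixed, sigma_b):
    """Quadrature approximation to the marginal likelihood of one district."""
    return normal_expectation(lambda b: conditional_likelihood(y, eta_fixed, b), sigma_b)

# Toy district: three patients with hypothetical fixed-effect predictors
toy_value = district_marginal_likelihood([1, 0, 0], [-1.0, -1.5, -0.5], sigma_b=0.3)
print(toy_value)
```

Multiplying such district contributions over all districts (or summing their logarithms) gives the objective that an optimizer such as Newton-Raphson would maximize over the regression coefficients and the random effect variance.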

Bayesian approach for GLMM
The Bayesian approach to GLMMs differs from likelihood methods in that it treats all unknown parameters in the model as random variables with probability distributions, called posterior distributions, whereas likelihood methods treat parameters as fixed constants. The likelihood of the model describes the data generating process given the parameters, and the prior usually reflects any previous knowledge about the model parameters. When prior knowledge is scarce, vague or non-informative priors are assumed so that the posterior distribution is driven by the observed data [22]. The Bayesian aim is to estimate the joint posterior distribution, and inference is conducted through this posterior distribution, which combines the information in the data, via the likelihood, with the prior distributions of the parameters.
Using Bayes' theorem, the joint posterior distribution can be expressed as [23]

$$p(\beta, b, \sigma_b \mid y) = \frac{\prod_{i=1}^{m} \prod_{j=1}^{n_i} f(y_{ij} \mid \beta, b_i) \, f(b_i \mid \sigma_b) \, p(\beta) \, p(\sigma_b)}{p(y)}, \qquad p(y) = \int \prod_{i=1}^{m} \prod_{j=1}^{n_i} f(y_{ij} \mid \beta, b_i) \, f(b_i \mid \sigma_b) \, p(\beta) \, p(\sigma_b) \, d\beta \, db \, d\sigma_b,$$

where $f(y_{ij} \mid \beta, b_i)$ denotes a probability mass function for the HIV status of a patient, $b = (b_1, \ldots, b_m)'$ and $p(y)$ is the marginal likelihood. As discussed in the previous section, for a GLMM this integral usually has no closed form. In such a case, it is necessary to resort to other methods to estimate the posterior distribution or, alternatively, to draw samples from it. The classical approach for Bayesian inference is to use Markov Chain Monte Carlo (MCMC) simulation techniques [24]. However, MCMC is computationally expensive. Therefore, we modeled the HIV prevalence using the integrated nested Laplace approximation (INLA) numerical method [25]. For a more detailed discussion of the INLA computational approach see, for example, [23,25,26]. A Bayesian approach requires the specification of prior distributions for all the random elements of the model. For the GLMM in Eq. (1), this involves choosing priors for the regression coefficients and for the hyperparameter of the district random effects, the standard deviation (SD) $\sigma_b$. In this paper, since empirical information on the parameters $\beta$ and $\sigma_b$ relevant to the study data is not available, non-informative priors were used [27]. Specifically, we used N(0, 1000) priors for the regression coefficients, since this is common practice and it allows the Bayesian method to be compared with the maximum likelihood method. Since the choice of priors can have an important impact on the posterior distributions of the model parameters, and model performance can be sensitive to the choice of prior for the district-specific random effects variance [27], we considered three priors or hyperpriors for the precision parameter $\tau_b$ associated with $\sigma_b$, which are based on the inverse gamma distribution [28], and selected the best for the current data using sensitivity analysis.
These priors are the R-INLA default prior, the prior of Lunn et al. [29] and the prior of Fong, Rue and Wakefield [30]. The fitted models were compared using the Deviance Information Criterion (DIC) [31] and the widely applicable information criterion (WAIC) [32]. For the statistical software implementation, R code was written for both the likelihood and Bayesian approaches. The GLMMs were fitted using the glmer() function from the lme4 R package [33], whereas the Bayesian analyses were done using the inla() function from the R-INLA package [34]. In addition, the ggplot2 and lattice packages of R were used for the graphical outputs. The detailed results are presented in the Results section.

Results
A large percentage (35.1%) of the patients were in the age category 25-49. Over half of them (54.8%) lived in urban areas, only 6% were government employees, only 7.6% had above secondary schooling, i.e. superior education level, about 45% were muslims, more than a third (38%) were married, and about half (50.4%) did not use a condom during sex (Additional file 1: Table S1).
The overall HIV prevalence among tested patients at government clinics in the Jimma zone in the period between September 2018 and August 2019 was 22.1% in women and 24.3% in men (Table 1). In the age group 15-19 years, prevalence was 11.1% in women and 11.7% in men, whereas in the age groups 20-24, 25-49 and 50 years and older the HIV prevalence was, respectively, 11.9% in women and 12.2% in men, 14.7% in women and 17.4% in men, and 6.6% in women and 7.2% in men. The HIV prevalence in Jimma zone increased with age, except for the age group 50 years and older, and was consistently higher among men than among women across all age groups (Table 1). The HIV prevalence was high among married patients compared to other marital status groups, among individuals who had completed primary schooling, daily laborers, muslims, urban residents and those who did not use a condom during sex (Table 1).

Multivariable models
To identify factors associated with HIV infection, a multivariable logistic regression model with district random effects was applied to the data. The covariates were checked for multicollinearity using the variance inflation factor (VIF) before adding them to the model. None of the VIFs (the values are between 1.007 and 1.455) was greater than 5, suggesting that collinearity is not strong enough to affect the statistical inference in the analysis. To assess the effect of a covariate on gender-specific prevalence, the analyses were done for each sex separately in addition to the analysis done using the combined data. The estimated variances of the district random effects were $\sigma_b^2$ = 0.094, 0.087 and 0.080 for the women, men and full data sets (Table 2), and for these data sets the asymptotic chi-square mixture distribution [18] test statistic for testing $H_0: \sigma_b^2 = 0$ against $H_1: \sigma_b^2 > 0$ took the values 19.86, 10.01 and 35.25 with p-values < 0.0001, 0.0008 and < 0.0001 (Table 2), respectively. The very small p-values strongly suggest a rejection of the null hypothesis $H_0: \sigma_b^2 = 0$ that no district-specific random effects should be included in the model. Therefore, these results imply the need for district (cluster) random effects in each model fitted to the data, which suggests that individuals with the same characteristics in different districts may have different HIV status in Jimma zone. The tests for fixed effects (Table 3) show that, except for occupation across the three data sets and religion in the men data set, the fixed effects or covariates were significantly associated with the odds of HIV infection because the 95% confidence intervals do not include 1.
The results in Table 3 also show that the test on the intercept, $H_0: \beta_0 = 0$, is significant, suggesting that a patient aged 50 years or older who was single, had no formal education, was muslim, had no job, used a condom during sex and lived in a rural area, and whose district had a random effect equal to zero, had a log-odds of HIV infection different from zero.
Note that the odds ratios for the individual variables reported in Table 3 are conditional or cluster-specific measures of association. That is, they are interpreted as having an effect conditional on the district random effect being held constant [35]. Therefore, the odds ratios are interpreted here for within-district comparisons, that is, as district-adjusted associations between patient characteristics and HIV prevalence.
Controlling for marital status, education, occupation, religion, place of residence and use of condom during sex, the odds of HIV infection was significantly higher in the age group 15-19 years (aOR 1.734, 95% CI 1.190-2.527 for women; aOR 1.819, 95% CI 1.271-2.602 for men; and aOR 1.781, 95% CI 1.374-2.308 in the full data) compared to patients aged 50 years or older who share the same district average risk, i.e. the same value of the random effect (Table 3). In contrast, the odds of HIV infection for women in the age group 20-24 years was significantly lower than for women aged 50 years or older who lived in the same district (aOR 0.707, 95% CI 0.514-0.971). When comparing patients who lived in the same district and who share identical values on the covariates age, education, occupation, religion, place of residence and use of condom during sex, the odds of HIV infection was 67.4 and 51.4% higher in married women and men compared with single women and men, respectively. A similar trend was also observed among divorced and widowed patients, where the odds of infection among divorced women and men was 51.3 and 42.5% higher than among single women and men, respectively, and among widowed women and men it was 65.9 and 37.1% higher than among single women and men, respectively. Further, except for the widowed men group, the adjusted odds ratios for each marital status category in the three data set analyses were statistically significant at the 5% level (Table 3): the 95% confidence intervals of the married, divorced and widowed groups for the women and full data sets, and of the married and divorced groups for the men data set, do not include 1, whereas the confidence interval for the widowed group in the men data set includes 1.
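The reported p-values for the variance component test can be checked with a few lines of Python. This sketch assumes that the reference distribution is the usual 50:50 mixture of a chi-square with 0 degrees of freedom (a point mass at zero) and a chi-square with 1 degree of freedom, which is the standard asymptotic result for testing a single variance component on the boundary; the source does not spell out the mixture, so this is an assumption. The helper names are hypothetical.

```python
import math

def chi2_1df_upper_tail(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def mixture_p_value(statistic):
    """P-value under the assumed 0.5*chi2(0) + 0.5*chi2(1) mixture distribution."""
    if statistic <= 0:
        return 1.0
    return 0.5 * chi2_1df_upper_tail(statistic)

# Test statistics reported in the text for the three data sets
p_values = {
    name: mixture_p_value(stat)
    for name, stat in [("women", 19.86), ("men", 10.01), ("full", 35.25)]
}
for name, p in p_values.items():
    print(f"{name:6s} p = {p:.2e}")
```

Under this assumption the men statistic of 10.01 gives a p-value close to the reported 0.0008, and the other two statistics give values far below 0.0001.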
The odds of HIV infection of women patients who had primary, secondary and superior education levels were 1.633 (95% CI 1.286-2.074), 1.327 (95% CI 1.007-1.749) and 1.319 (95% CI 0.871-1.998) times the odds of HIV infection of women patients with no formal education who share identical values on the covariates age, marital status, occupation, religion, place of residence and use of condom during sexual intercourse and who also share the same district average risk. For men patients, the odds were 1.471 (95% CI 1.162-1.864), 1.125 (95% CI 0.854-1.482) and 1.158 (95% CI 0.767-1.748) times higher in those with primary, secondary and superior education levels compared with those who had no formal education. However, the result was statistically significant only for those with primary education (Table 3). For men patients who were farmers, the odds of HIV infection 1.446 (95% CI 1.053-1.987)

Bayesian approach
Additional file 1: Tables S2 to S4 report the influence of the priors on the model parameters (posterior means and standard deviations). The three priors for the hyperparameter of the district-specific random effects variance yielded similar results for the regression coefficients in the models fitted to the women, men and full data sets. However, there were slight differences in the point estimates of the district random effects variances. Moreover, there were variations in the posterior densities of the precision $\tau_b$, where $\tau_b = \sigma_b^{-2}$, across the three priors (see Fig. 1). The figure reveals that the Lunn et al. [29] prior was slightly more informative than the Fong, Rue and Wakefield [30] prior, and that the inla default prior was the least informative.
Table 4 displays the DIC and WAIC for the models fitted to the women, men and full data sets using the different priors. The DIC and WAIC values of the Lunn et al. prior were slightly smaller for the models fitted to the women and full data sets, whereas for the model fitted to the men data the DIC and WAIC values of the Fong, Rue and Wakefield prior were slightly smaller than those of the inla default prior and the Lunn et al. prior (Table 4). These results suggest that the Lunn et al. prior is the preferred prior for the women and full data sets, whereas the Fong, Rue and Wakefield prior is the preferred prior for the men data set. Therefore, in what follows, we only report results based on these priors for the respective data.
The posterior means of the coefficients (Additional file 1: Tables S2 to S4) are very similar to the adaptive Gaussian quadrature or maximum likelihood estimates (Table 3), which is in line with the theory that non-informative priors should not have an effect on the posterior. Table 5 displays the adjusted odds ratios (aOR) and the corresponding 95% credible intervals (CI) for the covariates. Unlike the posterior means, the credible intervals from the Bayesian analysis were different from the likelihood confidence intervals. The credible intervals provide a measure of uncertainty, and since the posterior distributions of the regression coefficients are symmetric for the models fitted in this paper, these intervals are more robust than their likelihood counterparts [23].
In addition, women with primary, secondary and superior education levels were 63.7, 32.9 and 32.0% more likely to have HIV infection, respectively, compared to women with no formal education (aOR 1.637, 95% CI 1.290-2.079 for primary; aOR 1.329, 95% CI 1.009-1.753 for secondary; and aOR 1.320, 95% CI 0.872-2.001 for superior). However, the increase in odds of HIV infection was statistically non-significant for the superior education level. Among men patients, there were higher odds of HIV infection in the primary (aOR 1.474, 95% CI 1.164-1.866), secondary (aOR 1.125, 95% CI 0.854-1.482) and superior (aOR 1.157, 95% CI 0.767-1.747) education levels relative to men with no formal education, but the result was significant only for the primary education level. Also, women daily laborers, farmers, government employees and merchants were 16.3, 8.1, 0.6 and 23.2% more at risk of HIV infection, respectively, compared to those who were without a job, whereas men patients in these categories were 24.6, 45.1 and 16.2% more at risk of HIV infection, respectively, compared to those who were without a job, but the increase in odds of HIV infection was statistically significant only in men farmers (aOR 1.451, 95% CI 1.057-1.993).
Further, for women the odds of HIV infection among orthodox and protestant patients were 9.4 and 30.2% lower than among muslims (aOR 0.906, 95% CI 0.754-1.090 for orthodox; and aOR 0.698, 95% CI 0.602-0.809 for protestant), whereas for men the odds were 13 and 27.6% lower in orthodox and protestant patients relative to muslims (aOR 0.870, 95% CI 0.664-1.140 for orthodox and aOR 0.724, 95% CI 0.584-0.896 for protestant). For women who were urban residents, the odds of HIV infection was 13.3% higher than for those in rural areas (aOR 1.133, 95% CI 0.932-1.378), whereas for men urban residents the odds of HIV infection was 29.1% higher compared to rural residents (aOR 1.291, 95% CI 1.063-1.569). The results also showed that the odds of HIV infection for women who had sex without a condom was about 70.6 times the odds for those who used a condom during sex (aOR 70.611, 95% CI 61.068-81.872), and the risk was also very high for men who had sex without a condom (aOR 68.848, 95% CI 56.093-84.997).
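The percentage statements above follow directly from the adjusted odds ratios. As a minimal illustration, the Python sketch below converts an aOR to a percentage change in odds and checks whether an interval excludes 1, using the urban versus rural results quoted in this paragraph. The helper names are hypothetical and only illustrate the arithmetic.

```python
def percent_change_in_odds(aor):
    """Percentage change in odds implied by an adjusted odds ratio (negative = lower odds)."""
    return (aor - 1.0) * 100.0

def interval_excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1.0 or upper < 1.0

# Urban versus rural residence, values quoted in the text
results = {
    "women": {"aor": 1.133, "ci": (0.932, 1.378)},
    "men": {"aor": 1.291, "ci": (1.063, 1.569)},
}
for group, r in results.items():
    change = percent_change_in_odds(r["aor"])
    flag = interval_excludes_one(*r["ci"])
    print(f"{group}: {change:+.1f}% odds, interval excludes 1: {flag}")
```

For men the interval lies entirely above 1, whereas for women it contains 1, which matches the pattern of a larger urban effect among men.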
By transforming the quantiles of $\sigma_b$, i.e. by exponentiating the median and the limits of the 95% credible interval, one can easily interpret the standard deviation on the odds scale. For the results in Table 5, the median and 95% credible interval on the odds scale are 1.372 and (1.199, 1.673) for women, and 1.329 and (1.145, 1.642) for men, respectively. This suggests that one standard deviation in district variation would multiply the odds of HIV infection for women on average by about 1.372, or equally possibly divide them by 1.372, and that this multiplier would lie between 1.199 and 1.673 with probability 0.95. Therefore, the odds of HIV infection was heterogeneous among districts for both women and men in Jimma zone. Figure 2 shows caterpillar plots with the best linear unbiased prediction (BLUP) values of the district random effects (left panels) and the posterior means (the dots) and 95% credible intervals of the district random effects (right panels). The blue dots in the top left and bottom left panels are the conditional modes with error bars. A negative BLUP or posterior mean value for a district is associated with a lower HIV prevalence rate compared to the average HIV prevalence rate of Jimma zone, while a positive value is associated with a higher HIV prevalence rate. Therefore, the top plots both in the left and right panels suggest that Limu Kossa had significantly higher, whereas Jimma special town   compared to the average HIV prevalence rate of Jimma zone among men. Although Omo Beyam and Omo Nada had a large positive and a small negative posterior mean for the men data, respectively, their 95% credible intervals in Fig. 2d show that their HIV prevalence rates were not significantly higher or lower than the Jimma zone average because the intervals cross the zero line. The negative and positive BLUP values (and posterior means) for the district random effects also show the heterogeneity of HIV infection among districts in Jimma zone.
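The odds-scale interpretation of the district standard deviation given at the start of this subsection can be reproduced with a short Python sketch. It simply inverts the exponentiation: taking the logarithm of the reported odds multipliers recovers the median standard deviation on the log-odds scale. The function names are hypothetical; the numbers are the medians quoted in the text.

```python
import math

def odds_multiplier_to_sd(multiplier):
    """Recover the SD on the log-odds scale from its exponentiated (odds-scale) value."""
    return math.log(multiplier)

def sd_to_odds_multiplier(sigma_b):
    """Odds multiplier associated with a one-SD shift in the district effect."""
    return math.exp(sigma_b)

sd_women = odds_multiplier_to_sd(1.372)   # median for women, odds scale 1.372
sd_men = odds_multiplier_to_sd(1.329)     # median for men, odds scale 1.329

print(f"women: sigma_b ~ {sd_women:.3f}, men: sigma_b ~ {sd_men:.3f}")
# A district one SD below average divides the odds by the same factor
print(f"women, -1 SD multiplier: {sd_to_odds_multiplier(-sd_women):.3f}")
```

A district one standard deviation above the zone average therefore multiplies a woman's odds by about 1.37, and a district one standard deviation below divides them by the same factor.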

Discussion
This study focused on analysis of HIV prevalence data collected from district clinics register in Jimma zone, Oromia region, Ethiopia using applications of likelihood and Bayesian techniques. The results from applying both techniques on full data show that the odds of HIV infection significantly associated with age, gender, marital status, education level, type of occupation, religion, place of residence and condom use during sex. These results are in agreement with previous studies [36][37][38]. The two methods of model fitting provide similar results for all components of the fixed effects β except the posterior mean of farmer where unlike the likelihood result, in the Bayesian method this posterior mean was statistically significant for men and full data sets. Generally, compared to the maximum likelihood method, the Bayesian method yields accurate estimation of variance components [39] but in small samples the results depend on the selected prior distribution for district variance and this variance is also susceptible to bias [28]. Since the number of districts, i.e., clusters is 22 and the number of patients per districts large (ranged from 147 to 1071), there was very small differences among the standard deviations of district random effects for the three precision priors.
In this study, we have found evidence after controlling for other covariates that, in Jimma zone, the odds of HIV infection was lower in women across all age groups compared with men relative to adults patients who were 50 years or older, this result was different from the findings of Amornkul et al. [37,38,40], where the findings in these studies show that females are more likely to test positive for HIV. The odds of HIV infection was very high in both gender among adolescents, i.e. in age group 15-19 years, this might be due to the fact that younger age of initial sexual activity is a risk factor for HIV infection among this age group [41]. However, results from this study show that marital status (married, divorced and widowed) was associated with increased odds of HIV infection, this agreed with findings reported in [13,36,42,43] but it differed from Tlou [44] findings. In each of marital status categories, women had a higher odds of HIV infection. The current result might be related to cultural practices like polygamy, which is allowed in some religion, but there is no published report or information on this related to the study area, or it might be due to low rates of condom use by husbands with their wives and may be due to high rates of extramarital sex by men [45]. The study found that patients with a low level of education generally had the highest odds of HIV infection. Although similar trends were found among women and men, women who were in primary, secondary and superior education levels had a higher risk of HIV infection than men in the same categories. However, unlike women, highly educated men were at more risk of HIV infection, this is in agreement with Chen et al. [38] finding, but the current result was not statistically significant.
Further, the study findings show that, relative to patients with no job, men who were farmers, government employees and daily laborers were at greater risk of HIV infection than women in the same categories. This might be because of the relationship between occupation and sexual behaviour: those with higher incomes are more likely to engage in extra-relational sexual encounters [46], and daily laborers are associated with higher sexual risk behaviours [47]. However, the current result was statistically significant only for male farmers. This might be explained by the authors' personal observation that, in Ethiopia, male farmers travel to urban areas to sell their farm products and to purchase seeds, fertilizers and other farm-related equipment, and in most parts of the country they gather in bars (which are located in urban areas) to drink. As these places usually have sex workers, alcohol consumption could make the farmers very vulnerable and encourage them to have multiple sexual partners. Further, the results revealed that non-Muslim patients had a lower risk of HIV infection compared to Muslim patients; this result differs from Chen et al. [38], who observed that non-Muslims are more likely to test positive. In addition, the study showed that patients residing in urban areas were at greater risk of HIV infection than those in rural areas, and the risk was higher for men than for women; this result supports the Ethiopian Population-based HIV Impact Assessment report of the Ethiopian Public Health Institute [6]. The study also reaffirmed that condom use during sex is highly protective against HIV infection for both women and men.
The main strength of this study is that it illustrates the risk of HIV infection among women and men after controlling for possible confounders. The disaggregation of HIV prevalence by gender highlights the persistent features of the epidemic in Jimma zone; for example, men across all age groups had higher prevalence than women in the same age group, and the HIV prevalence or odds of HIV infection among married, divorced and widowed women were higher than among men of the same marital status. In addition, the best linear unbiased predictors for the district random effects helped to identify the districts that had significantly higher or lower HIV prevalence than the Jimma zone average. The district random effects represent the differences in patients' HIV test results that are attributable to the districts but were not captured by any of the covariates. These may include whether or not the local government authorities introduced educational programmes to promote condom use during sex and to encourage residents to know their HIV status. This analysis permits, for example, those districts with higher than zone-average HIV prevalence to be targeted for further research in order to improve their intervention strategies. Despite these strengths, the study had some limitations. HIV infection prevalence is associated with sexual behaviour characteristics such as age at first sex, condom use at first sex and number of sex partners, as well as frequency of HIV testing, HIV knowledge and clinical characteristics, e.g., sexually transmitted infections [37,38]. However, this information was not included in the analysis because the clinic patient register data used in this study do not contain it. Furthermore, our analysis of risk factors for HIV infection in Jimma zone was done using clinic register data; therefore, the results should be interpreted with caution.
The findings are limited to Jimma zone and are not necessarily generalizable to the Oromia region or the country.
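As a rough illustration of how predicted district random effects can be turned into the "higher", "lower" or "not different from the zone average" labels discussed above, the following minimal sketch may help. It assumes each district has a predicted effect on the log-odds scale with a standard error and uses a normal-approximation interval; the function name, the interval rule and all numeric values are hypothetical and are not taken from the study.

```python
from math import exp


def classify_district(effect, std_error, z=1.96):
    """Label a district relative to the zone average.

    effect    : predicted district random effect (log-odds scale); 0 is the zone average
    std_error : standard error of that prediction
    z         : normal quantile for the interval (1.96 gives roughly 95%)

    Returns (label, odds_ratio), where odds_ratio = exp(effect).
    The interval rule is an assumption for illustration only.
    """
    lower = effect - z * std_error
    upper = effect + z * std_error
    if lower > 0:
        label = "higher"          # whole interval above the zone average
    elif upper < 0:
        label = "lower"           # whole interval below the zone average
    else:
        label = "not different"   # interval covers the zone average
    return label, exp(effect)


# Hypothetical districts: (predicted effect, standard error). Not study data.
example_effects = {
    "District A": (0.60, 0.20),
    "District B": (-0.50, 0.15),
    "District C": (0.10, 0.25),
}

for name, (effect, se) in example_effects.items():
    label, odds_ratio = classify_district(effect, se)
    print(f"{name}: {label} (odds ratio vs. zone average = {odds_ratio:.2f})")
```

In a Bayesian fit, the same labelling could instead be based on whether a posterior credible interval for the district effect excludes zero.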